(54) Method and apparatus for forecasting and controlling congestion in a data transport
network

(57) A method and apparatus for controlling con- 
gestion at a node in a data network. The node includes 
an input for receiving traffic units from the network, an 
output for releasing traffic units to the network and a 
control unit. The control unit is responsible for estimat- 
ing a level of data occupancy of at least a portion of the 
network by looking at the traffic units received at the
input from a remote node in the network. When the data 
occupancy level reaches a certain threshold, the node 
issues a control signal to the remote node such that the 
remote node lowers its rate of traffic units input in the 
network. By estimating the network data occupancy 
level, congestion at the node can be effectively foreseen 
and controlled. 
Description

[0001] The present invention relates to the field of 
digital data transmission. More specifically, it pertains to 
a method and apparatus for forecasting and controlling 
congestion within a data transport network, such as a 
standard routed network or an add/drop packet net- 
work. 

[0002] Within a data transport network, it is desira- 
ble that bandwidth sharing between the network trunks 
be well managed so as to avoid congestion. In the sim- 
plest form, this means that trunks sharing a section of 
the physical transport medium should all be able to get 
a reasonable share of the bandwidth at that point. 
[0003] Prior art mechanisms for avoiding conges- 
tion within a data network include a centralized man- 
agement scheme or a complex system for interchange 
of information implemented within the transport net- 
work. One such system for interchange of information is 
equivalent to the bidding for space on the common 
transport medium by the various network nodes, where 
the bidding performed by a particular node is based on 
the amount of data traffic at that particular node. The 
amount of data traffic may be evaluated on the basis of 
queue length at the particular node. Unfortunately, such 
schemes do not cater to the fact that the data sources 
(nodes) are themselves adaptive and that queue 
lengths at the inputs to the transport medium are not an 
indication of potential demand. This reduces the effec- 
tiveness and accurateness of these congestion mecha- 
nisms, such that congestion may still exist within the 
transport network. 

[0004] Within a standard routed network, it is typical 
for a router to buffer all of its through traffic along with its 
local traffic. Buffer fill triggers packet loss, which in turn 
signals data sources to slow down. Another existing 
mechanism for implementing congestion control 
involves the monitoring of the average buffer fill such 
that discard may be effected before the buffer overflows. 
Unfortunately, such data buffering causes important 
latency within the network, as well as high-speed stor- 
age costs and loss of data at the router. 
[0005] There exists a need in the industry to provide 
an improved mechanism for controlling congestion 
within a data network. 

[0006] According to a first aspect of the invention, 
there is provided a node for use in a data network, said 
node comprising: 

an input for receiving traffic units from a first remote 
node; 

an output for releasing traffic units to a second 
remote node; 

a control unit coupled to said input for estimating a
data occupancy level of at least a portion of the
data network based at least on a rate of traffic units
passing from said input to said output, when the
data occupancy level reaches a certain threshold,
said control unit being operative to generate a
control signal instrumental to cause a reduction in
the data occupancy level.

[0007] The node of the invention enables a level of
data occupancy of at least a portion of the network to be
estimated by looking at the traffic units received from a
remote node in the network. The level of data occupancy
is representative of the amount of data being carried by
the network portion being evaluated and reflects the
level of congestion in the network portion of interest.
By estimating the network data occupancy level,
congestion at the node can be effectively foreseen and
appropriate action taken in an attempt to avoid it or at
least limit it.

[0008] In a specific example, the node evaluates
the data occupancy level of a certain portion of the
network and compares it against a threshold. This
threshold is dynamic and varies on the basis of the rate
of release from that node of traffic units input in the node
from a local source. When the threshold is exceeded,
the node issues a control signal that is sent to the
remote node. The control signal is a congestion stamp
placed into a certain traffic unit before its release from
the node into the network. The control signal is a
notification to the remote node to reduce the output of
traffic units into the network.

[0009] The traffic units in the data network may be
either user data packets, control packets or compound
packets having a user data part and a control part. The
user data packets and the user data parts of the
compound packets carry mostly user payload data, such
as speech samples, video samples or other data. The
control packets and control parts of the compound
packets carry control information, such as source and
destination identifiers, control sequence numbers and
reverse direction acknowledgements. In a specific
example, the traffic units used to evaluate the data
occupancy level within the network are control packets.

[0010] The present invention also encompasses a
method for controlling the congestion at a node in a data
network. The method comprises the steps of estimating
a data occupancy level of at least a portion of the data
network based at least on a rate of traffic units passing
through the node, and taking an appropriate action
in an attempt to reduce congestion (if congestion
exists).

[0011] Other aspects and features of the present
invention will become apparent to those ordinarily
skilled in the art upon review of the following description
of specific embodiments of the invention in conjunction
with the accompanying figures, in which:

Figure 1 is a block diagram of an example of a
ring-based data transport network;

Figure 2 is a block diagram of a transport node
shown in Figure 1, in accordance with an embodiment
of the present invention;






EP 1 061 698 A2 



Figure 3 is a functional block diagram of the trans- 
port node shown in Figure 2; 

Figure 4 is a flowchart illustrating the operation of a
program element in the transport node depicted in
Figure 2, which implements the congestion assessment
and operation of the transport node;

Figure 5 illustrates a mechanism for forecasting
congestion, in accordance with an embodiment of
the present invention.

[0012] In a specific example, the present invention is
implemented in a data transport network featuring a
ring-based transport medium. Figure 1 illustrates a
typical ring-based transport network 100, where the
transport ring 102 interconnects a plurality of nodes 104,
106, 108, 110, 112 and 114. Each node includes an
input/output pair corresponding to one direction of the
ring 102, where the input is for receiving traffic units
from the ring 102 and the output is for releasing traffic
units to the ring 102. Each connection between an
output of one node and an input of another, remote
node is defined as a trunk. Note that the endpoints of
such a trunk may be referred to as sender (originator of
the data being sent over the trunk) and receiver
(destination of the data being sent over the trunk). For
example, the connection between output 118 of node 104
(sender) and input 132 of node 112 (receiver) is a trunk
A, while the connection between output 156 of node 108
(sender) and input 142 of node 114 (receiver) is a trunk
B. Since the transport ring 102 is a medium that is
common to all of the nodes, the trunks formed between
these nodes must share the total bandwidth available
over the transport ring 102.

[0013] As shown in Figure 2, each of the transport
ring 102 nodes generally includes a control unit 200 and
a storage unit 202; assume for this example that we are
looking at node 104. The control unit 200 includes a
memory 204 and a processor 206, and is responsible
for controlling the flow of traffic units inserted by the
node 104 onto the transport ring 102. Control unit 200
further implements a congestion control mechanism
based on the amount of traffic being carried by the
transport ring 102, such that congestion at the node is
foreseen and avoided or at least reduced, as will be
described in further detail below. In this specific
example of implementation, the congestion control
mechanism is implemented by software executed by the
processor 206. The storage unit 202 includes a plurality
of buffers (queues) for receiving and storing data
arriving at node 104 from local sources, where the traffic
from these buffers is to be transported over respective
trunks of the transport ring 102. The storage unit 202 is
the actual physical storage facility where traffic units are
handled. Although the memory 204 is also a physical
storage device, it is used primarily for control purposes.
This distinction is not critical to the present invention
and an embodiment where a single storage medium is
provided that combines the memory 204 and the
storage unit 202 can clearly be envisaged.

[0014] The traffic units in the data network may be
either user data packets, control packets or compound
packets having a user data part and a control part. The
user data packets and the user data parts of the
compound packets carry mostly user payload data, such
as speech samples, video samples or other data. The
control packets and control parts of the compound
packets carry control information, such as source and
destination identifiers, control sequence numbers and
reverse direction acknowledgements. Note that in
addition to user payload data a user data packet may
contain some form of control element, for example an
identifier representative of a companion control packet.
[0015] The memory 204 of the control unit 200
includes two queues 208 and 210, hereafter referred to
as real buffer 208 and virtual buffer 210. The real buffer
208 receives traffic units from the various local buffers
of the storage unit 202, and provides a temporary
storage mechanism for holding all traffic units for
insertion onto the transport ring 102 until space is
available on the ring 102. The virtual buffer 210 is used
by the control unit 200 to determine whether congestion
is or will be experienced by the node 104, and has an
effective fill which reflects the amount of space available
on the ring 102 for receiving traffic from the node 104,
that is, the data occupancy level on the ring 102. The
functionality of the virtual buffer 210 will be described in
further detail below. The physical configuration of
buffers 208 and 210 does not need to be described in
detail because such components are readily available in
the marketplace and the selection of the appropriate
buffer mechanism suitable for use in the present
invention is well within the reach of a person skilled in
the art. The memory 204 also supports a TCP-like
adaptive window for use by the control unit 200, as will
be described in further detail below.

[0016] The memory 204 further contains a program 
element that regulates the congestion control mecha- 
nism of the node 104. The program element is com- 
prised of individual instructions that are executed by the 
processor 206, for evaluating the link occupancy of the 
transport ring 102 and for reducing the likelihood of con- 
gestion at the node. This program element will be 
described in further detail below. 

[0017] A conventional IP network implements
bandwidth sharing among host machines using the
Transmission Control Protocol (TCP). Although data flow
in the network can be bi-directional, it is usual to refer to
the originator of a particular piece of data as the sender
and the other end as the receiver. In TCP, the sender
(sender host machine) constantly tests the network to
see if more bandwidth is available and uses the loss of a
packet, determined by sequence numbers of TCP
packets, as an indication to decrease its rate. The
general characteristic of TCP is that it is self-clocking.
That is to say, the sender will wait for an
acknowledgement from the receiver for the packets
already sent before sending more packets. If the sender
waited for each individual packet to be acknowledged,
then the maximum rate that the connection could
achieve would be one packet per round trip time of the
connection. To increase the sending rate while keeping
the self-clocking nature of the protocol, the sender is
allowed to send some number of packets while waiting
for an earlier packet to be acknowledged. This number
of packets is called the window. The receiver itself may
constrain the size of the window in order to limit its
buffer requirement.

[0018] The current size of the window is called the 
congestion window and can vary between one packet 
and the maximum that the receiver is prepared to 
accept. As the sender receives acknowledgements, the 
window slides forward and also increases in size. An 
increase in size allows the connection to run faster. If a 
packet is not acknowledged it will eventually be consid- 
ered lost and this loss is assumed to be a result of con- 
gestion at some merge point. The sender, in addition to 
re-transmitting the packet, will reduce the window size. 
The slow and gradual increase in window size then 
begins again. 

[0019] An embodiment of the present invention
uses a mechanism to implement congestion control at a
node within a data transport network, where this
mechanism has some points in common with the TCP
process and is thus referred to as a TCP-like
mechanism. Specifically, the mechanism comprises the
use of an adaptive window scheme, such as that used
in TCP, where the adaptive window scheme controls the
real data transmission rates. The data occupancy level
of the network is estimated and regularly updated, such
that it may be used to adjust the window size.
[0020] In a particular example of implementation of
the present invention, the data network 100 implements
a control overlay concept, whereby data control is
detached from the user data itself. Unlike TCP where
control information is embedded in the data packets, the
control overlay concept separates the control
information from the user data. Specifically, for every
user data packet sent over the transport ring 102, there
is a corresponding control packet sent separately by the
data control system, which itself emulates the topology
of the data network. Note that this emulation of the data
network topology may be effected by using the same
physical path of the ring 102 as that used by the data
stream. Alternatively, the control packets could use a
physical path of the ring 102 that is separate from that
used for transporting the actual user data, as long as the
control packets travel in reasonable synchronism with
the user data packets. Taking for example the trunk
between sender 104 and receiver 112, for every user
data packet sent by the sender 104 to the receiver 112,
the control unit 200 will generate and send a control
packet over the ring 102 towards the receiver 112, thus
emulating the trunk. Alternatively, the user data packets
and control packets may be merged to form a compound
packet, where data control is embedded into the actual
user data stream, as in the case of TCP. The control
packet, in a specific example, is a predefined sequence
of bit fields containing, but not limited to: a busy/idle
indicator for the ring slot of the corresponding data unit;
source (sender) and destination (receiver) node
identifiers; control sequence numbers; congestion
notification; and reverse direction.

[0021] Assume hereinafter that both control packets
and user data packets form the body of traffic units
transiting through the transport ring 102, data control
information being separate from the user data stream
itself. Each node in the ring-based transport network
independently assesses the data occupancy level in the
network and implements a congestion control
mechanism in response to this data occupancy level, in
particular if the data occupancy level signals the
presence of congestion at the particular node. Specific
to the present invention, the control unit 200 of node 104
is operative to detect and foresee congestion at the node
104, in response to which it will generate a control
signal. In the situation where the node 104 is
experiencing congestion, this control signal is effective
to reduce the level of congestion. In the situation where
the node 104 will be experiencing congestion in the
future, this control signal is effective to reduce the
likelihood of congestion developing at the node 104.

[0022] In a specific example, the control signal
generated by the control unit 200 takes the form of a
congestion stamp applied to a control packet released
from the node 104 to the transport ring 102. Specifically,
each control packet has a congestion notification field.
As control packets are released in the network, this field
is set to a default value "not congested". An
intermediate node on the path of the trunk followed by a
control packet can apply a congestion stamp to the
control packet by setting the bits in the congestion
notification field of the control packet to "congested",
thus indicating that congestion is being experienced or
is being forecast at the intermediate node.

[0023] At the receiving end of the trunk, the receiver
will check control packets for this congestion stamp and,
if detected, will pass it back to the sender using an
outgoing control packet travelling in the reverse
direction over the transport ring 102. More specifically,
upon receiving a control packet over the transport ring
102, a transport node will check the destination node
identifier (receiver address) stored in a predetermined
field of the control packet. If the destination node
identifier corresponds to that particular transport node,
the control packet will then be checked for a congestion
stamp. Upon detection of such a stamp, the source node
identifier (sender address) will be read from the control
packet and the congestion stamp transmitted back to
the sender using an outgoing control packet travelling in
the direction of the sender.
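
The receiver-side handling described in this paragraph may be sketched as follows. This is a minimal illustration under stated assumptions and not a definitive implementation: the names ControlPacket and handle_control_packet, the use of integer node identifiers and the representation of the congestion notification field as a single boolean are all hypothetical and are introduced here only for the purpose of the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlPacket:
    """Hypothetical control packet holding only the fields used here."""
    source: int              # source (sender) node identifier
    destination: int         # destination (receiver) node identifier
    congested: bool = False  # congestion notification field


def handle_control_packet(node_id: int, packet: ControlPacket) -> Optional[ControlPacket]:
    """Return the reverse-direction packet carrying the stamp, if any.

    A node that is not the destination does nothing in this sketch.
    The destination node checks for a congestion stamp and, when one is
    found, addresses an outgoing control packet to the original sender.
    """
    if packet.destination != node_id:
        return None  # not the receiver for this trunk
    if not packet.congested:
        return None  # no stamp to pass back
    return ControlPacket(source=node_id, destination=packet.source, congested=True)
```

For instance, with node 112 as receiver and node 104 as sender, a stamped packet arriving at node 112 yields an outgoing packet addressed to node 104, while the same packet seen at any other node yields nothing.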

[0024] Note that the receiver is a node that is the 
intended recipient of the user data packet associated 
with the control packet. In that sense the receiver is
different from the intermediate nodes since it generates an
acknowledgement for the user data packet (the outgo- 
ing control packet mentioned above) to signal to the 
sender that the information has been correctly received. 
Such acknowledgement provides a convenient mecha- 
nism to transfer back to the sender the congestion 
stamp acquired by the control packet during its transit 
through one or more intermediate nodes. Thus, when 
the sender of the control packet receives the acknowl- 
edgment (control packet issued by the receiver), it is 
notified that congestion exists or is developing in the 
network. In response to this congestion stamp, the 
sender can reduce its rate of release of traffic units into 
the network in order to reduce the congestion. 
[0025] Typically only a small percentage of the con- 
trol packets are marked with congestion stamps. This 
isolation between the control information and the actual 
data stream accommodates implementations such as 
optical networks where the data control system can be 
run on a low speed system while the real data can 
exploit high speed, fixed-length parallel transfers. Other 
implementations provided for include packet-based sys- 
tems with variable length packets. The size of the data 
packet can be freely chosen as long as it allows for con- 
trol information (control packets) to be received often 
enough to suit the goal of the control loop (i.e. the round 
trip control loop timing of the system). Note that config- 
uring different data packet sizes for different trunks is 
one way to bias the sharing of available bandwidth 
between the trunks. 

[0026] When the sender 104 receives a control 
packet marked with a congestion stamp from the 
receiver 112, it will reduce the size of the TCP-like adap- 
tive window, thus reducing its data-sending rate. The 
sender 104 is aware of the round trip control time for the 
trunk and need only react to one congestion stamp in 
that time period. In a preferred embodiment of the 
present invention, the adaptive window control algo- 
rithm implemented by the transport ring nodes is based 
on the above-described TCP model of multiplicative 
decrease and additive increase. Specifically, the sender 
104 will progressively increase its data-sending rate 
until a congestion stamp is received, at which point it will 
reduce its data sending rate. In the absence of further 
congestion stamps, the sender 104 will again start pro- 
gressively increasing its data-sending rate. This algo- 
rithm will not be described in further detail, as it is well 
documented and well known to those skilled in the art. It 
should be noted that there are many alternative algo- 
rithms for use in implementing the adaptive window con- 
trol algorithm, also included within the scope of the 
present invention. 
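
By way of illustration only, the additive increase and multiplicative decrease behaviour, together with the rule that the sender need only react to one congestion stamp per round trip control time, may be sketched as follows. The class name AdaptiveWindow, the step of one packet per round trip, the halving factor and the window cap are assumptions made for this example; the description does not fix these values.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveWindow:
    """Hypothetical TCP-like adaptive window for one trunk."""
    size: float = 1.0                 # current window, in packets
    max_size: float = 64.0            # assumed cap imposed by the receiver
    increase: float = 1.0             # assumed additive step per round trip
    decrease: float = 0.5             # assumed multiplicative factor
    round_trip: float = 1.0           # round trip control time of the trunk
    last_cut: float = float("-inf")   # time of the last window reduction

    def on_round_trip_without_stamp(self) -> None:
        """Additive increase: grow gently while no stamp is received."""
        self.size = min(self.max_size, self.size + self.increase)

    def on_congestion_stamp(self, now: float) -> bool:
        """Multiplicative decrease, at most once per round trip.

        Returns True if the window was reduced, False if the stamp was
        ignored because a reduction already took place in this period.
        """
        if now - self.last_cut < self.round_trip:
            return False
        self.size = max(1.0, self.size * self.decrease)
        self.last_cut = now
        return True
```

In this sketch a window of 8 packets is halved to 4 on the first stamp, a second stamp arriving half a round trip later is ignored, and growth resumes by one packet per round trip thereafter.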

[0027] In a specific example of implementation, the
transport ring 102 is a slotted ring, wherein each slot on
the ring 102 represents a user data packet. Generally, a
slotted ring's data control system involves the use of an
information "header", travelling in parallel with the user
data packet and carrying basic information such as
whether the slot is in use and, if in use, the destination
node (receiver) to which it is being sent. This header
may also carry the control packet that can be marked at
any node to indicate congestion.

[0028] As generally discussed earlier, the
control unit 200 of a node assesses congestion by first
determining the data occupancy level of the transport
ring 102. This data occupancy level is then compared to
a threshold level, where the threshold level is dynamic
and varies on the basis of the rate of release of data
packets from the real buffer 208 (local sending rate).
Figure 3 is a functional illustration of the transport node
104, specifically intended to depict how the data
occupancy level is established and how congestion is
forecast. The control unit 200 checks the incoming
information headers at a monitoring point 300 and, for
each slot, updates a history of slot status maintained in
memory 204. The history of slot status maintained in
memory 204 includes the number of busy slots passing
the monitoring point as well as the number of available
slots passing the monitoring point, where these
variables are reset upon the expiration of one round trip
time of the transport ring 102. When updating the history
in memory 204, the control unit 200 treats each idle or
previously marked slot as available and all others as
busy. Note that a marked slot will result in at least one
available slot in the next period. Thus, this history
reveals the data occupancy level of the transport ring
102, that is the amount of data being carried by the
transport ring 102. The number of available slots
occurring in a period equal to one round trip time for the
ring is used to produce an effective "fill" of the virtual
buffer 210 in memory 204, where having no available
slots is equivalent to having 100% fill.
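
The slot history and the resulting virtual buffer fill may be sketched as follows. This is a hedged illustration only: the class name SlotHistory and its method names are hypothetical, and the linear mapping from the count of available slots to a fill between 0 and 1 is an assumption, the description stating only that no available slots corresponds to 100% fill.

```python
from dataclasses import dataclass


@dataclass
class SlotHistory:
    """Hypothetical per-round-trip history of slot status."""
    busy: int = 0
    available: int = 0

    def observe(self, idle: bool, marked: bool) -> None:
        """Record one information header seen at the monitoring point.

        Idle slots and previously marked slots count as available;
        all other slots count as busy.
        """
        if idle or marked:
            self.available += 1
        else:
            self.busy += 1

    def virtual_fill(self) -> float:
        """Effective fill of the virtual buffer, from 0.0 to 1.0.

        Assumed linear mapping: no available slots gives 1.0.
        """
        seen = self.busy + self.available
        if seen == 0:
            return 0.0  # nothing observed yet in this period
        return self.busy / seen

    def reset(self) -> None:
        """Clear the counters at the end of one ring round trip time."""
        self.busy = 0
        self.available = 0
```

With ten slots observed in a round trip, of which three are idle and two are busy but previously marked, five slots count as available and the fill in this sketch is 0.5.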

[0029] If the virtual buffer 210 fill is below the
threshold level, such that the number of available slots
on the transport ring 102 is adequate to handle all the
traffic that the node 104 wants to send in that period,
then there is no congestion. If the number is not
adequate, and the fill is above the threshold level, then
congestion notification must be invoked, by marking an
outgoing control packet with a congestion stamp at a
marking point 302. Note that the available slot
requirement for transport node 104, that is the threshold
level, is based on the current sending rate of the node
104, or on a future projection of this sending rate. The
earlier described adaptive window control algorithm
ensures that the TCP-like adaptive window growth is
very gentle so that the number of available slots seen
during one round trip time of the transport ring 102 can
be used to project the number available in the next
round trip time.
[0030] The threshold level is dynamic in that it
reflects the amount of local data that the node wishes to
input to the network. The more local data there is to be
released by the node 104 to the transport ring 102, the
lower the data occupancy level needed to trigger the
congestion control mechanism. Consequently, the
threshold level decreases with an increased amount of
local data for release by the node 104 to the
transport ring 102.
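
The relationship between the local sending requirement, the threshold level and the marking decision may be sketched as follows. The function names threshold_level and should_mark are hypothetical, and expressing the threshold as one minus the fraction of slots the node requires per round trip is an assumption chosen so that the fill exceeds the threshold exactly when the available slots fall short of the requirement.

```python
def threshold_level(required_slots: int, slots_per_round_trip: int) -> float:
    """Dynamic threshold on the virtual buffer fill, from 0.0 to 1.0.

    Assumed form: the more slots the node needs for its local traffic
    in one round trip, the lower the threshold.
    """
    if slots_per_round_trip <= 0:
        raise ValueError("slots_per_round_trip must be positive")
    required = min(max(required_slots, 0), slots_per_round_trip)
    return 1.0 - required / slots_per_round_trip


def should_mark(fill: float, required_slots: int, slots_per_round_trip: int) -> bool:
    """True when an outgoing control packet should carry a congestion stamp."""
    return fill > threshold_level(required_slots, slots_per_round_trip)
```

For a ring of 100 slots per round trip, a node needing 25 slots has a threshold of 0.75 in this sketch, and a node needing 50 slots has a threshold of 0.5, so the busier node marks at a lower data occupancy level.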

[0031] While implemented separately from the
above-described congestion assessment operation, it
should be noted that if an information header is
indicative of an idle slot, the control unit 200 generates
and inserts a control packet into the information header
for transmission over the transport ring 102. In addition,
a data packet from the real buffer 208 is inserted into the
corresponding parallel idle slot for transmission over the
transport ring 102.

[0032] Figure 4 provides a complete flowchart
illustrating an example of the operation of the program
element stored in the memory 204 of the control unit 200,
and executed by the processor 206, that regulates the
congestion control mechanism of the transport node
104. At step 402, the control unit 200 monitors the
information headers (control traffic) passing through
transport node 104. For each information header, the
control unit 200 checks to see if the header is marked as
being idle or not and updates the history maintained in
memory 204 accordingly, at step 404. Based on this
history, the number of slots available in a period equal to
one round trip time of the ring 102 is determined and the
effective fill of virtual buffer 210 is updated at step 406.
At step 408, the control unit 200 assesses its congestion
situation, using the fill of virtual buffer 210 (data
occupancy level) and the threshold level, itself based on
the current data sending rate of the transport node 104.
If the space required on the transport ring 102 based on
the current data sending rate is greater than the
available space on the transport ring 102, such that the
effective fill of virtual buffer 210 is above the threshold
level, an outgoing control packet is marked with a
congestion stamp at step 410.
[0033] In order to ensure that the transport ring 102
is operated at close to 100% usage for maximum
efficiency, a mechanism may be used to gradually
increase the congestion marking probability, such as is
used in Random Early Detection (RED). In a particular
embodiment, the 100% fill of virtual buffer 210 is equated
to the maximum threshold of RED (MAXth), while the
minimum threshold (MINth) is calculated by subtracting
the projected local requirement from MAXth, as shown in
Figure 5. Also, as in RED, the buffer fill can be
calculated as a weighted average over several round
trip times. Since RED is well documented and known to
those skilled in the art, it will not be described in further
detail.
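
A RED-like marking probability built on these two thresholds may be sketched as follows. This is an illustration under stated assumptions rather than the method of Figure 5: the function names, the linear ramp between MINth and MAXth, the ceiling max_p and the averaging weight are all hypothetical, with fill and local requirement both expressed as fractions of one round trip of slots.

```python
def marking_probability(avg_fill: float, local_requirement: float,
                        max_p: float = 1.0) -> float:
    """Probability of stamping an outgoing control packet.

    MAXth is the 100% fill of the virtual buffer. MINth is MAXth minus
    the projected local requirement. The linear ramp between the two
    is an assumption borrowed from the usual description of RED.
    """
    max_th = 1.0
    min_th = max(0.0, max_th - local_requirement)
    if avg_fill <= min_th:
        return 0.0
    if avg_fill >= max_th:
        return max_p
    return max_p * (avg_fill - min_th) / (max_th - min_th)


def update_average(avg: float, sample: float, weight: float = 0.25) -> float:
    """Weighted average of the fill over successive round trip times."""
    return (1.0 - weight) * avg + weight * sample
```

With a projected local requirement of 0.25, MINth is 0.75 in this sketch, so an average fill of 0.5 is never marked, an average fill of 0.875 is marked with probability 0.5, and a full virtual buffer is marked with probability max_p.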

[0034] Taking for example the slotted ring, the
control loop round trip time is identical for all transport
nodes and bandwidth sharing between nodes is
generally quite fair. In other add/drop networks, some
correction factors may be required to ensure fair sharing
or, as suggested earlier, to deliberately bias sharing
toward particular trunks. The sharing properties may be
biased by adjusting round trip times or by allocating
some trunks larger data units. In a particular example of
a slotted ring network, the sender node for a particular
trunk could be allowed to insert non-modifiable control
packets with the basic data packets. If one normal
control packet is sent for every three non-modifiable
control packets then the trunk has an effective data
packet of four times the basic data packet, thus
providing the trunk with a greater share of the bandwidth
over the transport medium.

[0035] In an alternative embodiment, the above- 
described control overlay concept and use of a virtual 
buffer to perform congestion assessment at a transport 
node may be implemented within a standard routed net- 
work. Specifically, at each network router congestion 
may be foreseen and reduced without causing network 
latency due to data buffering. Rather than buffering all 
data flowing through the network and using the buffer fill 
to trigger packet loss, the virtual buffer having a fill rep- 
resentative of the transport medium data occupancy 
level may be used to implement the congestion control 
mechanism. Further, the transport medium data occu- 
pancy level may be determined by monitoring the con- 
trol system, itself de-coupled from the data transport 
system. 

[0036] The above description of a preferred embod- 
iment under the present invention should not be read in 
a limitative manner as refinements and variations are 
possible without departing from the scope of the inven- 
tion, as defined in the appended claims. 

Claims 

1. A node for use in a data network, said node com- 
prising: 

an input for receiving traffic units from a first 
remote node; 

an output for releasing traffic units to a second 
remote node; 

a control unit coupled to said input for estimat- 
ing a data occupancy level of at least a portion 
of the data network based at least on a rate of 
traffic units passing from said input to said out- 
put, when the data occupancy level reaches a 
certain threshold said control unit being opera- 
tive to generate a control signal instrumental to 
cause a reduction in the data occupancy level. 

2. A node as defined in claim 1, wherein said traffic 
units are selected from the group consisting of user 
data packets, control packets and compound pack- 
ets including a user data part and a control part. 

3. A node as defined in claim 1 or 2, wherein said con- 
trol signal is directed to a remote node issuing traf- 
fic units towards said input. 

4. A node as defined in claim 1, 2 or 3, wherein said 
input is a first input, said node further comprising a 
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second input for receiving traffic units from a local 
source for transmission to a remote node in the net- 
work. 

5. A node as defined in claim 4, wherein said certain 5 
threshold is dynamic and varies on the basis of the 
rate of release from said node of the traffic units 
received at said second input. 

6. A node as defined in claim 5, wherein said traffic
units received from a local source for transmission
to a remote node in the network are user data packets.

7. A node as defined in claim 5 or 6, wherein said control
signal includes a congestion stamp placed into
a certain traffic unit released from said output.

8. A node as defined in claim 7, wherein said certain
traffic unit includes a source identifier and a destination
identifier, the destination identifier designating
a downstream remote node of the network, the
source identifier designating an upstream remote
node of the network, upon reception of the certain
traffic unit at the downstream remote node said
congestion stamp being transmitted to the
upstream remote node.

9. A node as defined in claim 8, wherein said control
unit is responsive to a congestion stamp received
from a remote node in the network and associated
to a user data packet originating from said node to
reduce a rate of release from said node of the traffic
units received at said second input.

10. A node as defined in claim 9, wherein said control 
unit is operative to progressively increase a rate of 
release from said node of the traffic units received 
at said second input until a congestion stamp is 
received from a remote node in the network.

11. A node as defined in claim 10, wherein each of said
traffic units received at said first input is either one
of a user data packet and a control packet, the control
packets being associated with respective user
data packets and being transmitted separately from
the user data packets in the network.

12. A node as defined in claim 11, wherein said control
unit estimates a data occupancy level of at least a
portion of the data network based at least on a rate
of control packets passing from said first input to
said output.

13. A node as defined in claim 9, wherein the congestion
stamp is carried by a control packet received
from a remote node in the network. 



14. A method for controlling congestion at a node in a 
data network, said node comprising: 

an input for receiving traffic units from a first 
remote node; 

an output for releasing traffic units to a second
remote node;

said method comprising: 

a) estimating a data occupancy level of at 
least a portion of the data network based 
at least on a rate of traffic units passing 
from said input to said output; 

b) when the data occupancy level reaches 
a certain threshold generating a control 
signal instrumental to cause a reduction in 
the data occupancy level. 

15. A method as defined in claim 14, wherein said traf- 
fic units are selected from the group consisting of 
user data packets, control packets and compound 
packets including a data part and a control part. 

16. A method as defined in claim 14 or 15, wherein said 
control signal is directed to a remote node issuing 
traffic units towards said input. 

17. A method as defined in claim 14, 15 or 16, wherein 
said input is a first input, said node further compris- 
ing a second input for receiving traffic units from a 
local source for transmission to a remote node in 
the data network. 

18. A method as defined in claim 17, wherein said cer- 
tain threshold is dynamic, said method comprising 
varying said threshold on a basis of a rate of 
release from said node of traffic units received at 
said second input. 

19. A method as defined in claim 17, wherein the traffic
units received at said second input are user data 
packets. 

20. A method as defined in claim 17, wherein the con- 
trol signal includes a congestion stamp placed into 
a certain traffic unit released from said output. 

21. A method as defined in claim 20, wherein said cer- 
tain traffic unit includes a source identifier and a 
destination identifier, the destination identifier des- 
ignating a downstream remote node of the network, 
the source identifier designating an upstream 
remote node of the network, upon reception of the 
certain traffic unit at the downstream remote node 
the congestion stamp being transmitted to the 
upstream remote node. 

22. A method as defined in claim 21, said method comprising
reducing a rate of release from said node of
the traffic units received at said second input in 
response to a congestion stamp received from a 
remote node in the network. 

23. A method as defined in claim 22, said method com- 
prising progressively increasing a rate of release 
from said node of the traffic units received at said 
second input until a congestion stamp is received 
from a remote node in the network.
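
The rate behaviour recited in claims 9, 10, 22 and 23 (progressively increase the release rate of locally sourced traffic until a congestion stamp is received, then reduce it) can be illustrated with a minimal sketch. The additive step, the multiplicative back-off factor, the rate limits and the class name are assumptions chosen for the example; the claims do not specify how large the increase or the reduction should be.

```python
class LocalRateController:
    """Hypothetical controller for the release rate of local traffic."""

    def __init__(self, rate=1.0, step=1.0, backoff=0.5,
                 min_rate=1.0, max_rate=100.0):
        self.rate = rate          # current release rate (units per interval)
        self.step = step          # assumed additive increase per interval
        self.backoff = backoff    # assumed multiplicative decrease factor
        self.min_rate = min_rate
        self.max_rate = max_rate

    def on_interval(self, stamp_received: bool) -> float:
        """Update the rate once per interval and return the new value."""
        if stamp_received:
            # A congestion stamp came back from a remote node: reduce the rate.
            self.rate = max(self.min_rate, self.rate * self.backoff)
        else:
            # No stamp seen: keep probing upward, bounded by max_rate.
            self.rate = min(self.max_rate, self.rate + self.step)
        return self.rate


ctrl = LocalRateController(rate=4.0, step=1.0, backoff=0.5)
# Two quiet intervals, one interval with a stamp, then one quiet interval.
rates = [ctrl.on_interval(s) for s in (False, False, True, False)]
print(rates)
```

An additive increase with a multiplicative decrease is only one possible policy; any rule that rises in the absence of stamps and falls on receipt of one would fit the wording of the claims equally well.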






EP 1 061 698 A2 




Figure 1 (drawing; reference numerals 100, 104, 110, 112, 116, 118, 128, 130, 134, 140, 146, 148, 150, 152 and 162)






Figure 2 (block diagram; labelled elements: 104; Storage Unit 202 (Buffers); Control Unit 200; Memory 204; Real Buffer 208; Virtual Buffer 210; Processor 206; 118; 140)







Figure 3 (labelled elements: TCP Sender Rate Control with shaping to current average; Space required; Look for congestion based on current sending rate; Virtual Buffer 210; Virtual Buffer Fill; Local traffic incoming; Through traffic; Real Buffer 208; Space available based on Traffic History; Insert when idle slot; Add congestion indication; Ring traffic in; Monitoring Point 300; Marking Point 302; Ring traffic out)






400 Start
402 Monitor through traffic for slot status
404 Update history of slot status
406 Update effective fill of virtual buffer 210
408 Effective fill > threshold level
410 Mark outgoing control packet with congestion stamp

Figure 4
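
Steps 402 to 410 of Figure 4 can be sketched as a simple loop. The sliding window of recent slot statuses, the use of the busy-slot count as the "effective fill", the window size, the threshold value and the function name are all assumptions made for this example; the flowchart itself only names the steps.

```python
from collections import deque


def monitor_slots(slot_statuses, window=4, threshold=2):
    """Return, per slot, whether an outgoing control packet would be stamped.

    Hypothetical rendering of Figure 4: a slot status is truthy when the
    slot carries through traffic and falsy when the slot is idle.
    """
    history = deque(maxlen=window)   # bounded history of slot status
    marks = []
    for busy in slot_statuses:       # 402: monitor through traffic
        history.append(bool(busy))   # 404: update history of slot status
        effective_fill = sum(history)            # 406: update effective fill
        marks.append(effective_fill > threshold)  # 408 / 410: compare, mark
    return marks


# Made-up slot trace: 1 = occupied slot, 0 = idle slot.
marks = monitor_slots([1, 1, 0, 1, 1, 1, 0, 0], window=4, threshold=2)
print(marks)
```

With a window of four slots and a threshold of two, stamping starts at the fourth slot, once three of the last four slots are occupied, and stops when the window again holds only two occupied slots.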







Figure 5 (two panels, each showing local arrivals, the thresholds MAXth and MINth, and the current "fill": a) Non-congested; b) Congested)






(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 1 061 698 A3

(12) EUROPEAN PATENT APPLICATION

(88) Date of publication A3: 04.09.2002 Bulletin 2002/36

(43) Date of publication A2: 20.12.2000 Bulletin 2000/51

(21) Application number: 00305032.5

(22) Date of filing: 14.06.2000

(51) Int. Cl.7: H04L 12/56, H04L 12/43

(84) Designated Contracting States:
     AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
     Designated Extension States:
     AL LT LV MK RO SI

(30) Priority: 15.06.1999 US 333269

(71) Applicant: Nortel Networks Limited
     Montreal, Quebec H2Y 3Y4 (CA)

(72) Inventors:
     • Chapman, Alan Stanley John
       Ontario K2K1V5 (CA)
     • Kung, Hsiang-Tsung
       Lexington, Massachusetts 02173 (US)

(74) Representative: Ertl, Nicholas Justin
     Elkington and Fife,
     Prospect House,
     8 Pembroke Road
     Sevenoaks, Kent TN13 1XR (GB)

(54) Method and apparatus for forecasting and controlling congestion in a data transport network






(57) A method and apparatus for controlling congestion
at a node in a data network. The node includes an
input for receiving traffic units from the network, an out- 
put for releasing traffic units to the network and a control 
unit. The control unit is responsible for estimating a level 
of data occupancy of at least a portion of the network by 
looking at the traffic units received at the input from a 
remote node in the network. When the data occupancy 
level reaches a certain threshold, the node issues a con- 
trol signal to the remote node such that the remote node 
lowers its rate of traffic units input in the network. By 
estimating the network data occupancy level, conges- 
tion at the node can be effectively foreseen and control- 
led. 






Printed by Jouve, 75001 PARIS (FR) 







European Patent Office

EUROPEAN SEARCH REPORT

Application Number: EP 00 30 5032

DOCUMENTS CONSIDERED TO BE RELEVANT

Citation of document with indication, where appropriate, of relevant passages:
    US 5 377 327 A (JAIN RAJENDRA K ET AL)
    27 December 1994 (1994-12-27)
    * column 4, line 43 - column 5, line 25 *

Relevant to claim: 1-4, 14-17, 19-23; 5-13, 18

CLASSIFICATION OF THE APPLICATION (Int.Cl.7): H04L12/56, H04L12/43

TECHNICAL FIELDS SEARCHED (Int.Cl.7): H04L

The present search report has been drawn up for all claims

Place of search: THE HAGUE
Date of completion of the search: 15 July 2002
Examiner: Strobeck, A

CATEGORY OF CITED DOCUMENTS

X : particularly relevant if taken alone
Y : particularly relevant if combined with another document of the same category
A : technological background
O : non-written disclosure
P : intermediate document

T : theory or principle underlying the invention
E : earlier patent document, but published on, or after the filing date
D : document cited in the application
L : document cited for other reasons

& : member of the same patent family, corresponding document





ANNEX TO THE EUROPEAN SEARCH REPORT
ON EUROPEAN PATENT APPLICATION NO. EP 00 30 5032

This annex lists the patent family members relating to the patent documents cited in the above-mentioned European search report.
The members are as contained in the European Patent Office EDP file on 15-07-2002.
The European Patent Office is in no way liable for these particulars which are merely given for the purpose of information.

Patent document            Publication    Patent family    Publication
cited in search report     date           member(s)        date

US 5377327 A               27-12-1994     US 5491801 A     13-02-1996
                                          US 5675742 A     07-10-1997
                                          US 5668951 A     16-09-1997

For more details about this annex: see Official Journal of the European Patent Office, No. 12/82






